Abstract

Aims. The main goal of this work is to explore which elements carry the most information about the birth origin of stars and, as such,
are best suited for chemical tagging.

Methods. We explored different techniques to minimize the effect of outlier lines on the abundances, using Ni abundances
derived for 1111 FGK-type stars. We evaluated how a limited number of spectral lines can affect the final chemical abundance. This
allowed us to compare, on an equal footing, the [X/Fe] scatter of elements that have different numbers of
observable spectral lines in the studied spectra.

Results. We found that the most efficient way of calculating the average abundance of an element when several spectral lines are
available is to use a weighted mean (WM), where the weight is based on the distance from the median abundance. This method
can be effectively used without removing suspected outlier lines. We showed that when the same number of lines is used to determine
chemical abundances, the [X/Fe] star-to-star scatter for iron-group and α-capture elements is almost the same. Beyond this, at a
lower level, the largest scatter was observed for Al and the smallest for Cr and Ni.

Conclusions. We recommend caution when comparing [X/Fe] scatters among elements that have a different number of spectral lines 
available. A meaningful comparison is necessary to identify elements that show the largest intrinsic scatter and can be thus used for 
chemical tagging. 

Key words. stars: abundances - stars: general - stars: fundamental parameters


1. Introduction 

Studies of large samples of stars are very important for
understanding the Galactic and stellar chemical evolution.
Understanding the effects of these two mechanisms is, in turn,
crucial for the studies of chemical properties of individual stars.
A representative example is the so-called Tc-trend: a trend of
chemical abundance with the condensation temperature of the
elements, whose real nature is still under debate (e.g. Melendez
et al. 2009; Ramirez et al. 2009; Gonzalez Hernandez et al. 2010,
2013; Schuler et al. 2011; Adibekyan et al. 2014; Onehag et al.
2014; Maldonado et al. 2015; Nissen 2015).

Precise and detailed chemical composition studies of large
samples of stars are also of great importance for different
avenues of Galactic astronomy. One of these avenues leads towards
the so-called chemical tagging technique: identifying stars
with identical chemical properties. This technique was introduced
by Freeman & Bland-Hawthorn (2002), and then explored and
developed by many other authors (e.g. De Silva et al. 2006;
Tabernero et al. 2012, 2014; Mitschang et al. 2013, 2014;
Blanco-Cuaresma et al. 2015). Chemical tagging is a very powerful
tool to identify stellar groups and clusters (e.g. Tabernero
et al. 2014; De Silva et al. 2013; Spina et al. 2014a,b; Quillen
et al. 2015) and even to identify solar siblings (e.g. Batista et al.
2014; Ramirez et al. 2014; Liu et al. 2015).


In all likelihood, not all elements are equally useful for
chemical tagging. A way of selecting the elements that can be
used to tag stars is to look at the star-to-star [X/Fe] abundance
ratio scatter at solar metallicities, where the Galactic chemical
evolution does not have a very strong effect. Elements that
show the largest star-to-star scatter are the most informative,
provided that the scatter is of physical origin.

The works of De Silva et al. (2006, 2007, 2009) on open clusters
and those of Ramirez et al. (2009) and Gonzalez Hernandez
et al. (2010) for solar twins/analogs clearly show that most of
the elements show very small star-to-star [X/Fe] scatter. These
authors performed a fully differential chemical abundance analysis
on a line-by-line basis with respect to a solar reference spectrum,
as well as to a star which is expected to belong to a given
open cluster or kinematical group. In particular, in the recent
work of Ramirez et al. (2014), a higher weight/priority was given
to Na, Al, V, Y, and Ba for chemical tagging. However, in
all the mentioned studies, when the star-to-star [X/Fe] scatters
were compared for different elements, an important parameter
was not taken into account: the number of spectral lines used to
derive abundances for each element.

In this work, using a large, high-quality dataset of solar-type
stars, we study the dependence of the [X/Fe] scatter on the number
of spectral lines. This allows us to compare the [X/Fe] scatter
of different elements on the same ground, by using
the same number of lines. Our sample comes from Adibekyan
et al. (2012) and consists of 1111 FGK-type dwarfs observed
with the high-resolution HARPS spectrograph. The stellar parameters
and abundances of the stars were derived from high
signal-to-noise ratio (SNR) spectra with a median SNR of 235
(only 15% of the spectra have SNR < 100).

We organize our paper as follows. In Appendix A we quantify
the precision in the abundance value as a function of the number
of spectral lines, and we summarize the results in Sect. 2. The discussion
on the [X/Fe] star-to-star abundance scatter and the conclusions
are presented in Sects. 3 and 4.


2. Reducing the impact of outlier lines


The data used in this work were taken from Adibekyan et al.
(2012), which provides chemical abundances for 12 iron-peak
and α-capture elements (15 ionized or neutral species). In the
present paper we did not use the final (average) abundances of
different elements, but instead we used the abundances derived
from individual lines of each element. As stated previously, this
is because we aim at studying the dependence of the precision in
abundances on the number of lines.

A standard and widely used technique to calculate chemical
abundances derived from several spectral lines is to apply an
outlier removal criterion and then calculate the arithmetic mean
(AM) of the abundances from the remaining lines. However,
the detection of outliers is not an easy task. There are several
outlier removal methods discussed in the literature, e.g.
σ-clipping (e.g. Shiffler 1988), modified Z-score (Iglewicz &
Hoaglin 1993), Tukey’s (boxplot) method (Tukey 1977), and
median-rule (Carling 1998). However, most of them are model
dependent and also depend on the applied threshold, for which
there is no clear prescription or theoretical ground. It is appropriate
to note that outlier removal is not the only method used to
characterize an underlying distribution in a dataset. An example
is the weighted least-squares regression to minimize the effects
of outlier data (Rousseeuw & Leroy 1987).

In Appendix A we present a comprehensive discussion
of different outlier removal methods and a new WM method in which
the weight is the (inverse) distance from the median value, measured
in units of standard deviations (SD). We made several tests
to evaluate the impact of outliers on the chemical abundances
(using Ni for our tests), and the dependence of the precision of
the abundances on the number of lines.

Our tests showed that when the number of lines is large, different
outlier removal techniques and criteria provide similar final
abundances. However, the line-to-line dispersion, which is
usually used to estimate the error on abundances, strongly depends
on the criteria and can be artificially (unrealistically) reduced
depending on the outlier removal method and threshold.
We therefore recommend using the WM (instead of any outlier
removal technique) when several lines are available.

We found that even for solar-type stars for which high-quality
data are available, significant deviations of the abundances
from the real value are possible when the number of lines is
small.





Figure 1. [X/Fe] star-to-star scatter for solar-analogs with solar- 
metallicity. The dashed lines, which represent [X/Fe] = 0.03 and 
0.06 dex, are just to make the comparison of the [X/Fe] scatters 
between the elements visually easier. 


We refer the interested reader to Appendix A for the details
of the tests and discussion.


3. [X/Fe] star-to-star scatter 


Recently, several works on solar analogs (e.g. Ramirez et al.
2009; Gonzalez Hernandez et al. 2010, 2013) showed that the
[X/Fe] versus [Fe/H] trends show very small star-to-star scatter
at solar metallicities for most of the elements. In Fig. 1
we plot the [X/Fe] scatter (rms) for dwarf stars (log g > 4 dex)
that have effective temperatures within 300 K of that of the Sun
(Teff = 5777 ± 300 K) and have metallicities in the range of
[Fe/H] = 0.0 ± 0.05, [Fe/H] = 0.0 ± 0.10, and [Fe/H] = 0.0 ± 0.20
dex, respectively. The abundances of all the elements were derived
using all the available lines by applying the WM technique.
We selected only stars with solar metallicities to minimize the
effect of Galactic chemical evolution and the thin/thick disk dichotomy
(however, see the discussion in Adibekyan et al. 2011,
2013 about the thin/thick disk separation at solar and super-solar
metallicities). The constraint on Teff serves to select the stars
with the highest precision of the stellar parameters and chemical
abundances (Sousa et al. 2008; Adibekyan et al. 2012; Tsantaki
et al. 2013). We note that the sample size is large enough to minimize
the errors related to the sampling of the population. For
example, if the scatter (standard deviation) is about 0.05 dex
(which is the case for most of the elements), the 95% confidence
interval of this value would be from 0.045 to 0.056 dex for the sample
size of 152 (the number of stars in the metallicity range of
0.0 ± 0.10 dex)*.
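The confidence interval quoted above can be checked with a short sketch. It assumes the standard chi-square interval for a standard deviation, which is not necessarily the exact prescription the authors followed, and it uses the Wilson-Hilferty approximation for the chi-square quantiles so that only the Python standard library is needed. The function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def chi2_quantile(prob, dof):
    """Wilson-Hilferty approximation to the chi-square quantile.

    Assumption: used here instead of an exact quantile function;
    it is accurate for large degrees of freedom such as dof = 151.
    """
    z = NormalDist().inv_cdf(prob)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * sqrt(c)) ** 3


def sd_confidence_interval(sd, n, level=0.95):
    """Chi-square based confidence interval for a sample standard deviation."""
    dof = n - 1
    alpha = 1.0 - level
    lower = sd * sqrt(dof / chi2_quantile(1.0 - alpha / 2.0, dof))
    upper = sd * sqrt(dof / chi2_quantile(alpha / 2.0, dof))
    return lower, upper


# Scatter of 0.05 dex measured on 152 stars, as in the text.
low, high = sd_confidence_interval(0.05, 152)
print(f"{low:.3f} to {high:.3f} dex")
```

Under these assumptions the interval agrees with the 0.045 to 0.056 dex range given in the text to within rounding.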

Fig. 1 shows that the highest scatter is observed for Na and
Al, and the [X/Fe] scatter for Si, Ca, Cr, and Ni is the lowest. The
number of available lines that were used to derive abundances of
Na and Al is the lowest (only two lines), while elements showing
the smallest scatter usually have more than 10 lines. From
the figure, it is apparent that the scatter does not change much
when different metallicity intervals are used. The only exception
is Mn, where the scatter increases with the width of the metallicity
interval. We note that for the derivation of Mn abundances
we did not consider hyperfine structure (hfs), which is important
for odd-Z elements; neglecting it might lead to an overestimate of
the Mn abundances deduced from a given EW. This can be one
of the reasons for the observed increase of the [Mn/Fe] scatter with
metallicity. Another reason for the observed high scatter at larger
ranges of [Fe/H] could be the strong Galactic evolution trend in
the [Mn/Fe] - [Fe/H] plane at solar metallicities (e.g. Adibekyan
et al. 2012; Battistini & Bensby 2015). We note that the trend
is strong even if the hfs effect, but not the non-LTE effect, is taken
into account (e.g. Battistini & Bensby 2015).

* The confidence interval of SD can be calculated as presented in
Sheskin (2007).

To evaluate the impact of the number of lines (i.e. precision)
on the [X/Fe] scatter we did a test similar to that presented in
the previous section (see Appendix A for the details). For each
element, we randomly drew N lines (where N ranges from
one to the maximum number of lines) and calculated the [X/Fe]
scatter for solar analogs in the metallicity range of 0.0 ± 0.10 dex.
If the number of possible combinations of the lines is less than
1000 we considered all of them; otherwise we limited ourselves to
a fixed number of 1000 random (but different, i.e. without replacement)
combinations.
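The drawing rule described above (all combinations when there are fewer than 1000, otherwise 1000 distinct random ones) can be sketched as follows. This is a minimal illustration and not the authors' code; the function name, the `seed` argument, and the use of integer line indices are assumptions made for the example.

```python
import itertools
import math
import random


def draw_line_combinations(lines, n_lines, max_draws=1000, seed=0):
    """Return combinations of `n_lines` lines chosen from `lines`.

    If fewer than `max_draws` combinations exist, all of them are
    returned; otherwise `max_draws` distinct random combinations are
    drawn (no combination is repeated).
    """
    total = math.comb(len(lines), n_lines)
    if total < max_draws:
        return list(itertools.combinations(lines, n_lines))

    rng = random.Random(seed)
    chosen = set()
    while len(chosen) < max_draws:
        # Sorting makes the tuple independent of the draw order,
        # so the set rejects duplicate combinations.
        chosen.add(tuple(sorted(rng.sample(lines, n_lines))))
    return list(chosen)


# Hypothetical example: 13 lines labelled by index, as for an element
# with 13 available lines.
line_ids = list(range(13))
few = draw_line_combinations(line_ids, 2)    # 78 combinations exist, all kept
many = draw_line_combinations(line_ids, 6)   # 1716 exist, 1000 are drawn
print(len(few), len(many))
```

Storing each draw as a sorted tuple in a set is one simple way to enforce the "different combinations" requirement; for very large numbers of lines the rejection loop stays cheap because collisions are rare.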

In Fig. B.1 we plot the [X/Fe] star-to-star
scatter as a function of the number of lines that were used for
the [X/H] abundance derivations. The plot clearly shows that the average
scatter decreases with the number of lines. The plot also
shows that the WM always gives smaller scatter than the AM (of
course, when the number of lines is larger than two). This fact
can be considered as an independent confirmation of the better
“precision” of the abundances calculated using the WM technique.

It is also very interesting to note that some individual lines
can introduce a very large scatter while others provide a very small
one. The results of this test can be used to rank the spectral lines
according to the [X/Fe] scatter they provide. This can be used as
a “new” method to eliminate outliers and select the best possible
lines. For example, there is one Ca line (λ5261.71) that clearly
shows a very large scatter (0.16 dex) compared to the remaining
12 lines (on average ≈0.06 dex). It is interesting and important
to note that the average difference between the [Ca/H] abundance derived
from this line and the mean abundance is very small,
but again with a large dispersion, <Δ[Ca/H]> = 0.05 ± 0.15 dex,
which means that the line does not show systematically higher
or lower abundance when compared to that derived with the remaining
lines. If all the 1111 stars are considered, then this difference
becomes smaller, and negative: <Δ[Ca/H]> = -0.02 ± 0.15
dex. Our analysis of [Ca/Fe] versus Teff for this line shows a
very weak trend (0.05 dex/1000 K) and particularly large scatter
at low temperatures. However, we found that the deviation of the
Ca abundance of this line from the mean Ca abundance depends
on the EW. The average EW of this line is 104 ± 23 and 116 ± 45
mÅ for the solar analogs and for all the stars, respectively. When
the EW is greater than 100 mÅ, the deviation increases significantly.

Similar to the discussed Ca line, we found some lines for
the other elements that show distinguishably large dispersion.
We provide the ranked list of all the lines ordered by the scatter
size.

Using the [X/Fe] scatter for all the elements and for different
numbers of lines, we compared the [X/Fe] star-to-star scatter
for different elements using the same number of lines in Fig. 2.
We plot the scatter derived using 2, 6, 13, and 20 lines because
these numbers best match the numbers of lines
available, maximizing the number of lines and elements in the
panels. For example, all the elements have at least two lines, and
there is only one element that has a number of lines between 6
and 13.

The top-left panel of Fig. 2 shows that when only two lines
are used for all the elements to calculate [X/Fe], almost all the
elements show a similar scatter of about 0.06 dex. Aluminum
shows the largest, and Cr and Ni show the smallest scatter.
However, one can also see that, depending on the combination
of lines, the [X/Fe] scatter can be different for the same element
(the error bar in the plot). The other three panels, which provide
information that is statistically more significant since it is based
on a larger number of lines, show that among the elements that have at
least six lines, Ti, V, Sc II, and Co show the largest scatter. Again,
Cr and Ni show the smallest scatter. We note that although the
obtained differences in [X/Fe] scatter between elements are not
large, they are based on a large sample and thus can be considered
statistically significant.

The decrease of the [X/Fe] scatter with the number of lines
means that a fraction of the observed scatter does not have an astrophysical
origin. Table 3 of Adibekyan et al. (2012) provides the
average error of the [X/Fe] ratios for the same sample of stars.
The table shows that the average error varies from 0.01 to 0.03
dex. For the elements that have at least 13 lines (Si I, Ca I, Ti I,
Cr I, and Ni I) the average error on [X/Fe] is 0.01 dex.

Our results show the importance of the initial selection of
the lines, especially when the number of lines is small. By carefully
selecting lines for individual stars with a given set of stellar
parameters and a given quality of the spectra, one can derive
precise chemical abundances, and obtain a small [X/Fe] scatter, even
when the number of lines is small, as already demonstrated by
e.g. Ramirez et al. (2009) and Gonzalez Hernandez et al. (2010)
for solar analog stars. However, when dealing with a large number
of stars with different combinations of stellar parameters and
quality of the spectra, it is not realistic to control the abundances of
each individual line in each individual star.


4. Summary and conclusion 


The table is available at the CDS. 


In this paper, we used a large sample of FGK stars (Adibekyan
et al. 2012) to study the dependence of the precision of chemical
abundances on the number of lines and how it affects the [X/Fe]
star-to-star scatter at solar metallicities. We explored different
techniques to calculate the mean abundance and minimize the
effect of possible outliers when several spectral lines are available
for an element.

From our tests we recommend using the WM
(instead of any outlier removal technique) when several lines are
available. As a weight, the distance from the median
abundance can be effectively used, as demonstrated.

Selecting only solar analogs with metallicities within 0.10 dex
of that of the Sun, we showed that the [X/Fe] scatter strongly
depends on the number of lines, suggesting that one should be
cautious when comparing the star-to-star abundance dispersion of
elements whose abundances were derived using different numbers
of lines. The decrease of scatter with the number of lines
suggests that some fraction of the observed scatter is of non-astrophysical
nature. A large number of lines is needed to reduce
the precision-induced scatter.

The comparison of the [X/Fe] scatter for different elements
using the same number of lines shows that most elements have a
very similar dispersion. The largest scatter among the elements
studied in this work was found for Na, Al, Ti, V, Sc II, and Co,
while Cr and Ni show the smallest scatter. The similarities and
differences in [X/Fe] scatter between the elements may reflect
their different/similar nucleosynthesis production sites (see e.g. Nomoto
et al. 2013).

Figure 2. [X/Fe] star-to-star scatter for solar-analogs with solar-metallicity. The [X/Fe] scatter is derived by using 2, 6, 13 and 20
lines. The error bars show the standard deviation of the scatter calculated using different combinations of lines. The dashed lines,
which represent [X/Fe] = 0.03 and 0.06 dex, are just to make the comparison of the [X/Fe] scatters between the elements visually
easier.


Our group is currently working on the derivation of abundances
of volatile (C and N) and r- and s-process elements
(Suarez-Andres et al., in prep.; Delgado-Mena et al., in prep.).
When the data are ready, a similar analysis will be done for these
elements to select those that are the most informative for
chemical tagging.

Appendix A: Dependence of abundances on the
number of lines: the case of Ni


Table A.1. The difference in Ni abundances when the WM method
and other methods are applied for the derivation of Ni.


The data used in this work were taken from Adibekyan et al.
(2012), which provides chemical abundances for 12 iron-peak
and α-capture elements (15 ionized or neutral species).

The lines used in this work are based on the line-list of
Neves et al. (2009). From the VALD online database, 180 lines
were carefully selected in the solar spectrum to be not blended,
to have equivalent widths (EW) above 5 mÅ and below 200 mÅ, and
to be located outside the wings of very strong lines. Later on,
semi-empirical oscillator strengths for the lines were calculated
by calibrating the log gf values to the solar reference of
Anders & Grevesse (1989). Moreover, only “stable” lines, which
do not show high abundance dispersion (i.e., 1.5 times the rms)
from the mean abundance for each element, were selected. In
this latter test, 451 stars with a wide range of stellar parameters
and SNR were used. The selected 180 lines were re-checked
in Adibekyan et al. (2012), where several lines were excluded
because of an observed trend of [X/Fe] with the effective
temperature. For more details about the selection of the lines
we refer the reader to Neves et al. (2009) and Adibekyan et al.
(2012). Our final line-list consists of 164 lines.

We stress that the main goal of this work is not to re-check
the quality of the lines, nor to provide a range of parameter space
(stellar parameters and SNR) where each individual line can be
safely and reliably used. Since different authors use different sets
of spectral lines and different atomic data for the lines, for us it
is more straightforward and scientifically interesting to discuss
methods that can, in principle, work effectively for different line-lists
and when applied to large datasets, as is often the case.


A.1. Comparing methods 


In Adibekyan et al. (2012) the final abundance for each star
and element was calculated as the arithmetic mean (AM) of the
abundances given by all lines detected in a given star and element
after a 2σ-clipping was applied. This is a standard
and widely used technique that allows one to avoid the errors caused
by bad pixels, bad measurements, cosmic rays, and other unknown
localized effects. However, this type of “outlier” removal
technique depends on the threshold (2σ in our case) that is
applied, for which there is no clear prescription or theoretical
ground, and the choice ends up being very subjective. The choice
of threshold should also depend on the sample size. A simple
demonstration of this sample size dependence is presented by
Shiffler (1988), who showed that the maximum possible Z-score
(the number of SD a data point lies from the mean) depends (only)
on the sample size and is computed as (n-1)/√n. From this formula
we get that the maximum deviation one can obtain in a
sample of 5 points (lines) is 1.79σ, i.e., no outliers can be identified
in the data if 2σ-clipping is applied. One can alternatively
use the median and median absolute deviation (MAD), which is expected
to be less sensitive to outliers, or apply other outlier removal
methods (e.g. Hodge & Austin 2004; Iglewicz & Hoaglin
1993). However, it is very difficult to choose a single method and
a criterion that will work efficiently for samples of different sizes.
Moreover, when a certain criterion is applied to remove possible
outliers, some valid lines from the real distribution can be
removed as well. Finally, one should also bear in mind that most
of the outlier removal methods are model-dependent, assuming
some distributions for the real and outlier data.

Methods              Threshold           ΔNi (dex)
WM - AM              -                   -0.0020±0.0096
WM - Median-rule     median±2IQR          0.0001±0.0052
                     median±2.5IQR       -0.0002±0.0057
                     median±3IQR         -0.0006±0.0062
WM - σ-clipping      median±2SD           0.0003±0.0049
                     median±2.5SD        -0.0002±0.0054
                     median±3SD          -0.0004±0.0062
WM - MAD_e           median±2.5MAD_e      0.0002±0.0049
                     median±3MAD_e        4.5x10-^ ±0.0057
                     median±3.5MAD_e     -0.0006±0.0062
WM - MAD_e^iter      median±2.5MAD_e      0.0004±0.0055
                     median±3MAD_e        0.0001±0.0052
                     median±3.5MAD_e     -0.0002±0.0057

VALD: Vienna Atomic Line Database.

The line-list was subsequently analyzed in Adibekyan et al. (2015)
to select a sub-list of lines suitable for abundance derivation for cool,
evolved stars.
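The sample-size limit on the Z-score attributed above to Shiffler (1988), (n-1)/√n, can be illustrated with a short sketch. The function names and the toy sample are hypothetical; the numerical check assumes the sample standard deviation (n-1 in the denominator), for which the bound is reached when all points but one coincide.

```python
from math import sqrt
from statistics import fmean, stdev


def max_z_score(n):
    """Largest Z-score any single point can reach in a sample of size n."""
    return (n - 1) / sqrt(n)


def smallest_sample_for_threshold(k):
    """Smallest sample size in which a k-sigma clipping can flag any point."""
    n = 2
    while max_z_score(n) <= k:
        n += 1
    return n


# Toy sample of 5 "lines": four identical values and one extreme value,
# the configuration that maximizes the Z-score of the extreme point.
sample = [0.0, 0.0, 0.0, 0.0, 1.0]
z_extreme = (sample[-1] - fmean(sample)) / stdev(sample)

print(round(max_z_score(5), 2))            # 1.79, as quoted in the text
print(round(z_extreme, 2))                 # same value from the toy sample
print(smallest_sample_for_threshold(2.0))  # 6
print(smallest_sample_for_threshold(3.0))  # 11
```

In other words, under this bound a 2σ-clipping cannot reject anything with fewer than 6 lines, and a 3σ-clipping cannot reject anything with fewer than 11 lines.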

To explore and choose the method that allows us to derive the
most precise final abundances of the elements, we selected Ni for
our analysis because it has the largest line-list (43 lines). By plotting
the individual Ni abundances in the full sample of 1111 stars
we noticed that many of the stars have Ni lines which deviate
from the average value by more than 3σ. To understand whether
these lines are outliers or just extremes of the distribution (a normal
distribution is assumed here) we performed some simple calculations.
If one assumes a normal distribution, then a 3σ deviation corresponds
to a probability of P = 0.003. Since on average we derive the Ni
abundance from 43 lines, the probability that a star will have
at least one “outlier” is about 43 × 0.003 = 0.129. This means that
among the 1111 stars we expect to have about 0.129 × 1111 = 143
stars with one “outlier” line. However, the number of stars which
have at least one “outlier” is 626. To estimate the probability of
having that many stars with at least one “outlier” we used the binomial
probability distribution. The probability that more than 200
stars (any number above 200) have an “outlier” is already
8 × 10^-7. This means that, under our assumption of a Gaussian distribution
of the abundances derived from different lines, some of
the lines which show large dispersion (> 3σ) can be real outliers
of different origin and are not just coming from the wings
of the Gaussian distribution. We note that the results of this test
do not depend on the applied threshold (3σ in this case).
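The arithmetic of this test can be reproduced with a short sketch. It follows the simplified estimate of the text (per-star probability taken as 43 × 0.003) and sums the exact binomial tail in log space; the function and variable names are illustrative, and the exact tail value depends on details such as whether exactly 200 stars is included.

```python
import math


def binomial_tail(n, p, k_min):
    """P(X >= k_min) for X ~ Binomial(n, p), summed term by term.

    Each term is evaluated through log-gamma functions to avoid
    overflow in the binomial coefficients.
    """
    log_p, log_q = math.log(p), math.log1p(-p)
    total = 0.0
    for k in range(k_min, n + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1)
                   - math.lgamma(n - k + 1) + k * log_p + (n - k) * log_q)
        total += math.exp(log_pmf)
    return total


n_stars, n_lines, p_line = 1111, 43, 0.003

# Simplified per-star probability of at least one >3-sigma line, as in the text.
p_star = n_lines * p_line
expected_stars = p_star * n_stars

# Probability that more than 200 stars host such a line by chance alone.
tail = binomial_tail(n_stars, p_star, 201)

print(round(p_star, 3), round(expected_stars))
print(f"{tail:.1e}")
```

With these inputs the expected number of stars is about 143 and the tail probability comes out at the 10^-7 level, so the 626 stars actually observed cannot plausibly be attributed to the Gaussian wings alone.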

Fortunately, if the line-list is large, the possible outliers do
not much affect the final (mean) abundance. We first tested
three different outlier removal methods on our data, namely σ-clipping (e.g.
Shiffler 1988), modified Z-score (Iglewicz & Hoaglin 1993), and
median-rule (Carling 1998). The modified Z-score
method is similar to n×σ-clipping, but instead of the mean and SD,
the median and MAD_e are used. Median-rule is a modification
of Tukey’s (boxplot) method (Tukey 1977) and defines outliers
as points that lie farther than median ± k×IQR, where IQR is
the interquartile range. We varied k over {2, 2.5, 3} for the σ-clipping
and median-rule, and k = {2.5, 3, 3.5} for the modified
Z-score. The selected values of k are within the intervals suggested
in the above cited references.

MAD_e = 1.483×MAD, and is equal to SD for large normal data.

Figure A.1. Difference in [Ni/H] and σ[Ni/H] when the WM and the AM without outlier removal are applied (left). The same as in
the left panel, but the parameters are derived using the WM and MAD_e^iter (with the threshold of median±3MAD_e) techniques (right).

Figure A.2. The difference between the original Ni abundance and the Ni abundances derived with only 2, 10, and 30 Ni lines.
Panels (horizontal axis: Δ[Ni/H] in dex): HD66040 (Teff = 5226 K, log g = 4.34 dex, [Fe/H] = 0.35 dex, SNR = 143), with legend
values 0.101 for 2 lines and 0.028 for 10 lines; HD59468 (Teff = 5618 K, log g = 4.39 dex, [Fe/H] = 0.03 dex, SNR = 2020), with
legend values 0.018 for 2 lines, 0.006 for 10 lines, and 0.003 for 30 lines.

A potential difficulty that one faces when trying to remove
outliers is the so-called masking and swamping effects: removal
of one outlier changes the “status” of the other data points
(Acuna & Rodriguez 2004). This means that it is advisable to remove
one outlier at a time and apply the criterion recursively.
However, it is not obvious when the outlier removal
should be stopped (the problem exists also when the outliers
are removed at once). Two approaches were considered in our
tests when the modified Z-score method was used: i) remove all the
outliers at once (we call it the MAD_e technique in the remainder of
the paper), and ii) remove one outlier at a time and then apply
the criterion again iteratively (hereafter we call it the MAD_e^iter technique).
For the second approach we allowed a maximum
of 10 iterations, although in most cases a lower number
of iterations was needed (depending, of course, on the threshold
adopted).
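The two removal strategies described above can be sketched as follows. This is an illustrative reading of the text, not the authors' implementation: the scale is taken as 1.483×MAD, points farther than k times that scale from the median are rejected, and the function names and sample abundances are hypothetical.

```python
from statistics import median


def mad_scale(values):
    """1.483 x MAD, comparable to the SD for normally distributed data."""
    med = median(values)
    return 1.483 * median(abs(v - med) for v in values)


def clip_at_once(values, k=3.0):
    """Remove, in a single pass, every point farther than k scaled MADs."""
    med, scale = median(values), mad_scale(values)
    if scale == 0.0:  # more than half of the points identical
        return list(values)
    return [v for v in values if abs(v - med) <= k * scale]


def clip_iteratively(values, k=3.0, max_iter=10):
    """Remove the single worst point, recompute, and repeat."""
    kept = list(values)
    for _ in range(max_iter):
        med, scale = median(kept), mad_scale(kept)
        if scale == 0.0:
            break
        worst = max(kept, key=lambda v: abs(v - med))
        if abs(worst - med) <= k * scale:
            break  # nothing left beyond the threshold
        kept.remove(worst)
    return kept


# Hypothetical line abundances for one star, with one discrepant line.
abundances = [6.20, 6.22, 6.21, 6.19, 6.23, 6.21, 6.80]
print(clip_at_once(abundances))
print(clip_iteratively(abundances))
```

For this small example both strategies reject only the discrepant line; they can differ when several outliers mask one another, which is the case the iterative variant is meant to address.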

Outlier removal is not the only method used to characterize
an underlying distribution in a dataset. An example is the
weighted least-squares regression to minimize the effects of outlier
data (Rousseeuw & Leroy 1987).


The last method that we use to calculate the final abundance
and its line-to-line scatter is the WM and weighted SD. As a
weight we used the (inverse) distance from the median value in
terms of SD, which we then binned. Using the MAD or the SD in the
calculation of the weight gives, on average, very similar results,
but if the values of more than half of the points (lines) are
the same (this can happen when the number of lines is small),
then the MAD is by definition zero and cannot be used to calculate
the weight. Since the distance of the median point from the median
is zero, the weight of that line would be infinite. To avoid
giving a very high weight to the points that are initially close to
the median (the final value would by construction be very close
to the median), we decided to bin the distances with an interval
of 0.5 SD. E.g., a binned distance of 0.5×SD (whose inverse is the
weight) was assigned to the lines that lie between 0 and 0.5 SD
from the median. Similarly, a binned distance of 1×SD was assigned
to the lines lying between 0.5 and 1×SD, and so on.
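One possible reading of this weighting scheme is sketched below. It is an assumption-laden illustration rather than the authors' code: the weight is taken as the inverse of the distance from the median rounded up to the next multiple of 0.5 SD, the SD is the sample standard deviation of the line abundances, and the weighted SD uses the same weights. The function name and sample abundances are hypothetical.

```python
from math import ceil, sqrt
from statistics import fmean, median, stdev


def weighted_mean_abundance(abundances, bin_width=0.5):
    """Weighted mean and weighted SD with inverse binned-distance weights.

    Distances from the median are expressed in units of SD and rounded
    up to the next multiple of `bin_width`, so lines within 0.5 SD of the
    median all share the largest weight.
    """
    values = list(abundances)
    if len(values) < 2:
        return values[0], 0.0
    sd = stdev(values)
    if sd == 0.0:  # all lines give the same abundance
        return values[0], 0.0

    med = median(values)
    weights = []
    for v in values:
        n_bins = max(1, ceil(abs(v - med) / (bin_width * sd)))
        weights.append(1.0 / (n_bins * bin_width))

    total = sum(weights)
    wm = sum(w * v for w, v in zip(weights, values)) / total
    wsd = sqrt(sum(w * (v - wm) ** 2 for w, v in zip(weights, values)) / total)
    return wm, wsd


# Hypothetical line abundances for one star, with one discrepant line.
abundances = [6.20, 6.22, 6.21, 6.19, 6.23, 6.21, 6.80]
wm, wsd = weighted_mean_abundance(abundances)
print(round(fmean(abundances), 3), round(wm, 3), round(wsd, 3))
```

In this example the discrepant line is kept but down-weighted, so the WM lies closer to the median than the AM without any line being discarded.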

The results of our tests are summarized in Table A.1. The
tests showed that all the outlier removal methods give a mean final
abundance similar to that of the WM. Since the number of
lines is relatively large, the impact of possible outliers is small,
and all the values were also similar to the abundance calculated
by the AM of all the points. However, we note that when the
lowest thresholds were set to remove outliers, some stars, due to
the large number of removed “outliers”, showed deviations of
the final abundance from the mean abundance derived with the other
methods. Another important point to stress is that when
outlier removal methods were applied with low thresholds, the
line-to-line scatter (which is usually used as an error estimate of
the final abundance) was usually small, as expected.

From these tests (and further tests presented next in this 
work), we concluded that the best way to calculate the final abun¬ 
dance and its error is to use the WM. In this case, the weight of 
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Figure A.3. Average deviation from the original Ni abundance for 1111 stars versus number of lines that were used for the abun¬ 
dance derivations. The right panel is the zoom of the left plot, only limited to six lines. Different techniques that were used in the 
calculations are mentioned in the plots. 
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Figure A.4. Dependence of the average deviation from the original Ni abundance for 1111 stars versus stellar parameters and SNR. 
The abundance deviation represents the difference between original Ni abundance and Ni abundance derived only with 2 lines. 


real outliers (extremes) is small, and the final abundance is not
affected. With this approach, we also do not reduce the scatter
by artificially removing points from the distribution. In Fig. A.1
we plot the distribution of the differences in the Ni abundance and
its error (line-to-line dispersion) when the WM and AM methods are
applied (left plot), and when the WM and the MAD (median±3MAD)
criterion are applied (right plot). From the plot and table it is
clear that when the number of lines is large, the different outlier
removal (or no removal) methods provide very similar results for the
Ni abundances; however, the error associated with these values depends on
the method. In particular, Fig. A.1 shows (left plot) that the line-to-line
scatter of [Ni/H] is always larger when the Ni abundance is
calculated by the AM than when the WM method is used (the
difference in σ[Ni/H] is always positive). The right panel of the
same figure shows that the difference in σ[Ni/H] when the Ni abundance
is calculated by the WM and MAD methods is usually small and
can be both positive and negative.
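
As an illustration of the WM idea, the following is a minimal sketch, not the authors' implementation. The text only states that the weight depends on the distance from the median abundance; the inverse-distance form, the `floor` parameter (used to avoid division by zero), and the weighted line-to-line scatter are assumptions made here for the example, and the line abundances are made-up numbers.

```python
import numpy as np

def weighted_mean_abundance(line_abundances, floor=0.01):
    """Weighted mean and weighted scatter of line-by-line abundances (dex).

    ASSUMPTION: weight = 1 / max(|x - median|, floor). The exact weight
    function is not given in this section; this is one plausible choice.
    """
    x = np.asarray(line_abundances, dtype=float)
    dist = np.abs(x - np.median(x))          # distance from the median
    w = 1.0 / np.maximum(dist, floor)        # distant lines get small weights
    mean = np.sum(w * x) / np.sum(w)
    # Weighted line-to-line scatter around the weighted mean (assumed form).
    scatter = np.sqrt(np.sum(w * (x - mean) ** 2) / np.sum(w))
    return mean, scatter

# Hypothetical abundances from six lines; the last one is an outlier.
lines = [0.02, 0.00, -0.01, 0.03, 0.01, 0.45]
wm, wm_err = weighted_mean_abundance(lines)
am, am_err = np.mean(lines), np.std(lines)
print(f"WM = {wm:.3f} +/- {wm_err:.3f}, AM = {am:.3f} +/- {am_err:.3f}")
```

In this toy case the outlying line is kept but contributes very little, so the WM stays close to the median while the AM and its scatter are pulled towards the outlier.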

A word of caution should be added at this point. In the methods
that we tested to remove “outliers” and in the WM technique
we assume that the distribution of the abundances (or the distribution
of the errors on abundances) is symmetric. However, as
was shown in Bertran de Lis et al. (2015), for very weak lines
with an assumption of LTE (local thermodynamic equilibrium)
the distribution of uncertainties of abundances is asymmetric.


Note that for the WM method there is no assumption on the normality
of the distribution of the errors of abundances, while some outlier
removal methods are based on this assumption.
The authors also showed that this effect depends on the SNR
and is negligible for lines with EW greater than 8 mÅ regardless
of SNR. However, since all the methods are based on the same
hypothesis, the WM technique remains favorable for us.


A.2. Abundance precision dependence on the number of
lines

To evaluate the impact of the number of lines (that one uses for
abundance derivations) on the abundances, we did the following
simple tests. For each star in the sample, we randomly drew N Ni
lines (N = 2, 3, ..., 42) and calculated the Ni abundance. We used
the above-mentioned WM technique for the calculation of the
abundances. Then we compared the resulting abundances with
the original Ni abundance value (derived by using all the 43
lines available and the WM technique). If the number of possible
combinations was less than 1000, we considered all the possible
combinations of lines; otherwise we drew 1000 random, but
different, combinations of lines.
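
The drawing procedure described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names, the fixed random seed, and the synthetic line abundances are hypothetical, and a plain mean is used as a stand-in for the WM to keep the block self-contained.

```python
import itertools
import math
import random
import statistics

def sample_line_subsets(n_lines_total, n_draw, max_trials=1000, seed=0):
    """Return subsets (tuples of line indices) of size n_draw.

    All combinations are returned if there are fewer than max_trials of
    them; otherwise max_trials distinct random combinations are drawn.
    """
    if math.comb(n_lines_total, n_draw) < max_trials:
        return list(itertools.combinations(range(n_lines_total), n_draw))
    rng = random.Random(seed)
    subsets = set()
    while len(subsets) < max_trials:
        subsets.add(tuple(sorted(rng.sample(range(n_lines_total), n_draw))))
    return sorted(subsets)

def deviation_scatter(line_abundances, n_draw, average=statistics.mean):
    """Standard deviation of (subset abundance - all-line abundance)."""
    reference = average(line_abundances)
    deltas = [
        average([line_abundances[i] for i in subset]) - reference
        for subset in sample_line_subsets(len(line_abundances), n_draw)
    ]
    return statistics.stdev(deltas)

# Synthetic star: 43 line abundances with 0.05 dex line-to-line scatter.
rng = random.Random(1)
lines = [rng.gauss(0.0, 0.05) for _ in range(43)]
for n in (2, 10, 30):
    print(n, round(deviation_scatter(lines, n), 4))
```

With 43 lines, only the smallest and largest subset sizes have fewer than 1000 combinations (for example, 903 pairs for N = 2), so most values of N use the random draws.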


In Fig. A.2 we plot an example (for two stars) of the distribution
of the differences in Ni abundances (Δ[Ni/H]) between the values
obtained with three different numbers of Ni lines (2, 10, and 30 lines)
and with all the available lines. The stars have different stellar
parameters and different SNR in the spectra. The plot shows that
when the number of lines is increased the abundance difference
gets smaller. It also shows that while in most of the cases/trials
Δ[Ni/H] is close to zero, it is possible to obtain very large
differences when only two lines are used (even for very high SNR
data).

We did the aforementioned computations for all the 1111
stars and for each number of lines we calculated the standard
deviation of the Δ[Ni/H] distribution, σ_dev.
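
For reference, the quantity defined above can be written explicitly. This is a sketch in our own notation: the trial index k, the number of trials K, the subscript 43 for the all-line abundance, the sign convention of the difference, and the use of the sample (K-1) normalization are assumptions, not taken from the text.

```latex
\Delta[\mathrm{Ni/H}]_{k}(N) = [\mathrm{Ni/H}]_{N,k} - [\mathrm{Ni/H}]_{43},
\qquad
\sigma_{\mathrm{dev}}(N) =
\sqrt{\frac{1}{K-1}\sum_{k=1}^{K}
\left(\Delta[\mathrm{Ni/H}]_{k}(N) - \overline{\Delta[\mathrm{Ni/H}]}(N)\right)^{2}}
```

Here N is the number of randomly drawn lines and the overline denotes the mean over the K trials for a given star.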

In Fig. A.3 we plot the dependence of the average of
σ_dev for all 1111 stars as a function of the number of lines. In
the plot, we limit ourselves to examples of four techniques
with different thresholds in order not to overload the figure,
although we applied all the techniques and thresholds presented
in Table A.1. Moreover, since in these tests the size of the sample
(lines) varies, we decided to also test lower outlier removal
thresholds: k = 1.5 for the σ-clipping and median-rule methods, and
k = 2 for the modified Z-score method.
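
To make the role of the threshold k concrete, here is a minimal sketch of three single-pass outlier criteria with the low thresholds quoted above. The exact definitions used in the paper are given in Table A.1 and are not reproduced in this section, so the forms below are assumptions: a median ± k·MAD criterion stands in for the median-based rule, and the modified Z-score uses the commonly quoted 0.6745 scaling constant. The function names and the line abundances are hypothetical.

```python
import numpy as np

def sigma_clip_mask(x, k=1.5):
    """Keep points within mean ± k standard deviations (single pass)."""
    x = np.asarray(x, dtype=float)
    return np.abs(x - x.mean()) <= k * x.std()

def mad_mask(x, k=1.5):
    """Keep points within median ± k·MAD (assumed form of the criterion)."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    return np.abs(x - med) <= k * mad

def modified_zscore_mask(x, k=2.0):
    """Keep points with |0.6745 (x - median) / MAD| <= k."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    if mad == 0:                      # all central values identical
        return np.ones(x.shape, dtype=bool)
    return np.abs(0.6745 * (x - med) / mad) <= k

# Hypothetical abundances from six lines; the last one is an outlier.
lines = np.array([0.02, 0.00, -0.01, 0.03, 0.01, 0.45])
for name, mask in [("sigma-clipping", sigma_clip_mask(lines)),
                   ("median +/- k MAD", mad_mask(lines)),
                   ("modified Z-score", modified_zscore_mask(lines))]:
    print(f"{name}: kept {mask.sum()} of {lines.size} lines")
```

In this toy example the MAD-based criterion with k = 1.5 also rejects one non-outlying line, which illustrates how low thresholds can exclude a large number of lines.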


Fig. A.3 shows the range of possible deviations (1σ deviation
if the distribution were Gaussian) from the original value
for a given random star when N randomly drawn lines are used.
It clearly shows that the deviation decreases very steeply with
the number of lines and becomes smaller than 0.01 dex when
more than 15 lines are used.

In the right panel of Fig. A.3 we show that, for a number
of lines less than or equal to six, there is a subtle difference between
the different outlier removal techniques. It clearly shows that
the smallest average deviation is obtained when the WM is used.
We note that other tests with different thresholds show similar
results. The low thresholds for outlier removal techniques give
results closer to those obtained by using the WM for a small number of
lines. However, when low thresholds are considered for a large
number of lines, due to the high number of excluded lines, the final
results deviate from the abundances obtained by using the WM
method. For the remainder of the paper we use abundances calculated
by the WM method, unless another method is specified.
Here we should stress again that we plot the possible deviations
of Ni abundances averaged for 1111 stars. While these average
values are small, the deviations for individual stars can be very
significant (as demonstrated in Fig. A.2).

It is natural to expect that the observed deviations should depend
chiefly on the quality of the data (e.g. SNR) and also on the
atmospheric parameters of the stars. This is because, for example, spectral
lines in the spectra of cooler stars are usually more blended, and also
because different lines form at different layers of the atmosphere
and have different sensitivities to non-LTE effects. In
Fig. A.4 we plot, for the case in which only two lines were used,
the dependence of the average σ_dev on the stellar atmospheric
parameters and on the SNR. The plot shows that there is a
strong and clear dependence only on Teff. This result is expected
since at low temperatures the spectra are crowded
and line blending plays a stronger role. The lowest metallicity stars
and stars with the lowest SNR also show somewhat larger deviations.
It is interesting to note that even if the SNR is very
high, depending on the stellar parameters, it is possible to obtain a
Ni abundance up to 0.1 dex different from the original abundance
when only two Ni lines are used.


Appendix B: [X/Fe] star-to-star scatter: dependence 
on the number of lines 


We note that 1000 is a sufficiently high number of combinations
and our tests showed that increasing this number by a factor of 100 has
negligible impact on the results.
Figure B.1. Dependence of [X/Fe] star-to-star scatter for solar analogs with [Fe/H] = 0.0±0.10 dex on the number of lines. Red
triangles show the scatter when the individual abundances are calculated as an AM and the blue squares indicate the scatter in
[X/Fe] when the WM method was used for the abundance derivation. The black dots show the [X/Fe] scatter for each individual
line that was used to derive [X/H]. The error bars indicate the dispersion of possible combinations of the lines.







































